Use nvbench entropy stopping #15229

Closed

PointKernel (Member) commented Mar 5, 2024

Description

NVIDIA/nvbench#151 introduced a new entropy stopping criterion.

TBD

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

PointKernel added the libcudf, improvement, and non-breaking labels Mar 5, 2024

PointKernel (Member, Author) commented

distinct_inner_join_32bit results with different stopping criteria:

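  1. The current default stdrel used by libcudf:

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Samples | CPU Time | Noise | GPU Time | Noise |
|----------|--------------|----------|------------------|------------------|---------|----------|-------|----------|-------|
| I32 | I32 | 0 | 100000 | 100000 | 7888x | 67.146 us | 15.20% | 63.394 us | 13.57% |
| I32 | I32 | 0 | 100000 | 400000 | 2864x | 178.145 us | 5.54% | 174.591 us | 5.08% |
| I32 | I32 | 0 | 10000000 | 10000000 | 41x | 12.418 ms | 0.12% | 12.415 ms | 0.12% |
| I32 | I32 | 0 | 10000000 | 40000000 | 13x | 38.947 ms | 0.04% | 38.943 ms | 0.04% |
| I32 | I32 | 0 | 10000000 | 100000000 | 11x | 89.417 ms | 0.03% | 89.415 ms | 0.03% |

  2. The predefined entropy:

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Samples | CPU Time | Noise | GPU Time | Noise |
|----------|--------------|----------|------------------|------------------|---------|----------|-------|----------|-------|
| I32 | I32 | 0 | 100000 | 100000 | 706x | 72.673 us | 6.73% | 68.619 us | 3.32% |
| I32 | I32 | 0 | 100000 | 400000 | 1144x | 184.809 us | 5.67% | 180.956 us | 5.10% |
| I32 | I32 | 0 | 10000000 | 10000000 | 958x | 12.405 ms | 0.14% | 12.402 ms | 0.13% |
| I32 | I32 | 0 | 10000000 | 40000000 | 385x | 38.999 ms | 0.09% | 38.996 ms | 0.09% |
| I32 | I32 | 0 | 10000000 | 100000000 | 168x | 89.504 ms | 0.05% | 89.501 ms | 0.05% |

  3. The custom entropy (max-samples = 20):

| Key Type | Payload Type | Nullable | Build Table Size | Probe Table Size | Samples | CPU Time | Noise | GPU Time | Noise |
|----------|--------------|----------|------------------|------------------|---------|----------|-------|----------|-------|
| I32 | I32 | 0 | 100000 | 100000 | 20x | 74.243 us | 7.58% | 70.088 us | 4.47% |
| I32 | I32 | 0 | 100000 | 400000 | 20x | 193.101 us | 2.52% | 188.915 us | 1.09% |
| I32 | I32 | 0 | 10000000 | 10000000 | 20x | 12.694 ms | 2.48% | 12.689 ms | 2.48% |
| I32 | I32 | 0 | 10000000 | 40000000 | 20x | 39.007 ms | 0.07% | 39.004 ms | 0.07% |
| I32 | I32 | 0 | 10000000 | 100000000 | 20x | 89.468 ms | 0.03% | 89.465 ms | 0.02% |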
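
For a rough sense of how much GPU time each criterion spends sampling, the rows above can be multiplied out as samples × mean GPU time. This is only a back-of-the-envelope sketch: it ignores warm-up, setup, and any CPU-side overhead, so it is not the wall-clock time of a benchmark run. The dictionary labels and the helper name `sampled_gpu_seconds` are illustrative and not part of nvbench or libcudf.

```python
# Units for converting the reported GPU times to seconds.
US, MS = 1e-6, 1e-3

# (build table size, probe table size, samples, mean GPU time in seconds),
# copied from the three result tables above.
results = {
    "stdrel (default)": [
        (100_000, 100_000, 7888, 63.394 * US),
        (100_000, 400_000, 2864, 174.591 * US),
        (10_000_000, 10_000_000, 41, 12.415 * MS),
        (10_000_000, 40_000_000, 13, 38.943 * MS),
        (10_000_000, 100_000_000, 11, 89.415 * MS),
    ],
    "entropy (predefined)": [
        (100_000, 100_000, 706, 68.619 * US),
        (100_000, 400_000, 1144, 180.956 * US),
        (10_000_000, 10_000_000, 958, 12.402 * MS),
        (10_000_000, 40_000_000, 385, 38.996 * MS),
        (10_000_000, 100_000_000, 168, 89.501 * MS),
    ],
    "entropy (max-samples = 20)": [
        (100_000, 100_000, 20, 70.088 * US),
        (100_000, 400_000, 20, 188.915 * US),
        (10_000_000, 10_000_000, 20, 12.689 * MS),
        (10_000_000, 40_000_000, 20, 39.004 * MS),
        (10_000_000, 100_000_000, 20, 89.465 * MS),
    ],
}


def sampled_gpu_seconds(rows):
    """Approximate the GPU time spent sampling each row as samples * mean GPU time."""
    return [samples * gpu_time for _, _, samples, gpu_time in rows]


# Sum the per-row estimates to get one figure per stopping criterion.
totals = {name: sum(sampled_gpu_seconds(rows)) for name, rows in results.items()}

for name, rows in results.items():
    per_row = ", ".join(f"{t:.3f}" for t in sampled_gpu_seconds(rows))
    print(f"{name}: per-row [{per_row}] s, total {totals[name]:.2f} s")
```

Under this approximation the totals come to roughly 3.0 s for stdrel, 42.2 s for the predefined entropy, and 2.8 s for the entropy run capped at 20 samples.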

PointKernel (Member, Author) commented

Closing this, since entropy stopping is not the approach we want for reducing nightly benchmark run time.

PointKernel closed this May 7, 2024